Skip to content

Conversation

@jerboaa
Copy link
Contributor

@jerboaa jerboaa commented Nov 19, 2025

Please review this simple fix to jlink's --release-info plugin to handle non-ASCII in vendor strings. The JDK build uses UTF-8 encoding for the produced release file that is being passed to jlink at build-time via the --release-info plugin. However, the plugin internally uses java.util.Properties.load(InputStream) API which assumes ISO-8859-1 encoding of the input stream. The proposed fix is to use the java.util.Prorperties.load(Reader) API instead and pass it a Reader with UTF-8 encoding (using nio API).

Testing:

  • GHA
  • test/jdk/tools/jlink tests including the new reg-test which fails prior and passes after the fix.

Thoughts?


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8372155: ReleaseInfoPlugin doesn't handle input file as UTF-8 properly (Bug - P3)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/28399/head:pull/28399
$ git checkout pull/28399

Update a local copy of the PR:
$ git checkout pull/28399
$ git pull https://git.openjdk.org/jdk.git pull/28399/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 28399

View PR using the GUI difftool:
$ git pr show -t 28399

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/28399.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Nov 19, 2025

👋 Welcome back sgehwolf! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Nov 19, 2025

@jerboaa This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8372155: ReleaseInfoPlugin doesn't handle input file as UTF-8 properly

Reviewed-by: alanb, jpai

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 75 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk
Copy link

openjdk bot commented Nov 19, 2025

@jerboaa The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@jerboaa jerboaa marked this pull request as ready for review November 20, 2025 09:16
@openjdk openjdk bot added the rfr Pull request is ready for review label Nov 20, 2025
@mlbridge
Copy link

mlbridge bot commented Nov 20, 2025

Webrevs

@jaikiran
Copy link
Member

jaikiran commented Nov 20, 2025

Hello Severin, would it better to (even) specify the expectations of the --release-info jlink plugin and what encoding it expects for the file content?

I might be wrong, but even with the use of UTF-8 for reading the given file (like proposed in this PR), it may not guarantee that it is the right Charset to use for that file.

@jerboaa
Copy link
Contributor Author

jerboaa commented Nov 20, 2025

Thanks for the review.

Hello Severin, would it better to (even) specify the expectations of the --release-info jlink plugin and what encoding it expects for the file content?

IIUYC, you'd like to have the help text amended for --release-info to say it expects the input file in UTF-8? Is that what you are saying?

IMO it would be best for --release-info to not only require a file as an option but also the encoding. It could perhaps default to UTF-8. @AlanBateman Any thoughts on this?

I might be wrong, but even with the use of UTF-8 for reading the given file (like proposed in this PR), it may not guarantee that it is the right Charset to use for that file.

True. If somebody passes an ISO-8859-1 (or any non-utf8) encoded file to --release-info it would break for any non-ASCII characters.

@jaikiran
Copy link
Member

jaikiran commented Nov 20, 2025

Hello Severin, would it better to (even) specify the expectations of the --release-info jlink plugin and what encoding it expects for the file content?

IIUYC, you'd like to have the help text amended for --release-info to say it expects the input file in UTF-8? Is that what you are saying?

Correct, but not necessarily UTF-8. I am not familiar where all this release-info gets used during jlink and then in the generated image. So whatever encoding is appropriate for such content, I think we should specify it, in the help text and man page of jlink.

@jerboaa
Copy link
Contributor Author

jerboaa commented Nov 20, 2025

Correct, but not necessarily UTF-8.

It expects the input file to be UTF-8 now. So I've added that the plugin help text in 2f48c81

I am not familiar where all this release-info gets used during jlink and then in the generated image.

It's the plugin that creates the $JDK/release file. That's it.

So whatever encoding is appropriate for such content, I think we should specify it, in the help text and man page of jlink.

I've added it to the --list-plugin help text now. Plugins aren't mentioned in the jlink man page, so it doesn't seem appropriate to mention it the jlink man page. Here is how it looks like when using jlink --list-plugins with the latest update:

  --release-info <file>|add:<key1>=<value1>:<key2>=<value2>:...|del:<key list>
                            <file> option is to load release properties from
                            the supplied file. The specified file is expected
                            to be encoded in UTF-8.
                            add: is to add properties to the release file.
                            Any number of <key>=<value> pairs can be passed.
                            del: is to delete the list of keys in release file.

@AlanBateman
Copy link
Contributor

IMO it would be best for --release-info to not only require a file as an option but also the encoding. It could perhaps default to UTF-8. @AlanBateman Any thoughts on this?

It's very likely that tools that read the release file aren't using Properties API so I think having it UTF-8 is best.

@jerboaa
Copy link
Contributor Author

jerboaa commented Nov 20, 2025

IMO it would be best for --release-info to not only require a file as an option but also the encoding. It could perhaps default to UTF-8. @AlanBateman Any thoughts on this?

It's very likely that tools that read the release file aren't using Properties API so I think having it UTF-8 is best.

The output $JDK/release file will be UTF-8 since the ReleaseInfoPlugin uses PrintWriter(BytearrayOutputStream) to write the content bytes to the ResourcePoolEntry. That in-turn uses UTF-8 since JEP 400 (by means of Charset.defaultCharset().

Question is if it's worth supporting arbitrary input encondings for --release-info <file> usages. <file> is currently in UTF-8 in the JDK build, so went with that expectation. But arguably it could be any other encoding a user chooses. So to make it generic, it's conceivable to allow --release-info=<file>,<encoding> or some such to allow non-UTF-8 as input and still do the right thing (Note: output would still be UTF-8). Perhaps it's not worth the trouble?

@AlanBateman
Copy link
Contributor

Question is if it's worth supporting arbitrary input encondings for --release-info <file> usages. <file> is currently in UTF-8 in the JDK build, so went with that expectation. But arguably it could be any other encoding a user chooses. So to make it generic, it's conceivable to allow --release-info=<file>,<encoding> or some such to allow non-UTF-8 as input and still do the right thing (Note: output would still be UTF-8). Perhaps it's not worth the trouble?

That seems unnecessary complexity, I think we should just keep it as UTF-8.

@RogerRiggs
Copy link
Contributor

Please fix the typo in the PR title. (I corrected the issue title).

@jerboaa jerboaa changed the title 8372155: RealeaseInfoPlugin doesn't handle input file as UTF-8 properly 8372155: ReleaseInfoPlugin doesn't handle input file as UTF-8 properly Nov 20, 2025
@jerboaa
Copy link
Contributor Author

jerboaa commented Nov 24, 2025

Any more input on this one? I'd appreciate a review. Thanks!

@jaikiran
Copy link
Member

Hello Severin, the changes this PR looks good to me. Should anything be specified/noted for the encoding of the key/value of the release-info plugin?

--release-info <key1>=<value1>

Copy link
Member

@jaikiran jaikiran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please wait for Alan's/Roger's review before integrating.

@jerboaa
Copy link
Contributor Author

jerboaa commented Nov 24, 2025

Should anything be specified/noted for the encoding of the key/value of the release-info plugin?

--release-info <key1>=<value1>

I don't think so. In my testing on Linux with en_US.UTF-8 LANG those are working fine:

$ ./jdk/bin/jlink --add-modules java.base --output ./build/testme --release-info 'add:myval=ööaßÄ oy'
$ grep myval ./build/testme/release 
myval=ööaßÄ oy
$ locale | grep LANG
LANG=en_US.UTF-8

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Nov 24, 2025
Copy link
Contributor

@AlanBateman AlanBateman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All okay, just a minor comment on a var usage in the test.

@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Nov 24, 2025
@openjdk openjdk bot added the ready Pull request is ready to be integrated label Nov 24, 2025
@jerboaa
Copy link
Contributor Author

jerboaa commented Nov 24, 2025

/integrate

@openjdk
Copy link

openjdk bot commented Nov 24, 2025

Going to push as commit 42b108b.
Since your change was applied there have been 81 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Nov 24, 2025
@openjdk openjdk bot closed this Nov 24, 2025
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Nov 24, 2025
@openjdk
Copy link

openjdk bot commented Nov 24, 2025

@jerboaa Pushed as commit 42b108b.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core-libs [email protected] integrated Pull request has been integrated

Development

Successfully merging this pull request may close these issues.

4 participants